Bayesian identification of clustered outliers in multiple regression

Author

Donna L. Mohr

Abstract

We propose a Bayesian model for clustered outliers in multiple regression. In the literature, outliers are frequently modeled as coming from a subgroup where the variance of the errors is much larger than in the rest of the data. By contrast, when a cluster of outliers exists, we show that it can be more informative to model them as coming from a subgroup where different regression coefficients hold. We can explicitly model the clustering phenomenon by assuming that the probability of an outlier is a function of the explanatory variables. Fitting proceeds via the Gibbs sampler, using the Metropolis–Hastings algorithm to produce variates from the more unusual distributions. Initialization uses a least median of squares fit, and in some ways this method can be viewed as a Bayesian version of the many algorithms that use this fit as a start to some more efficient estimator. This method works very well in a variety of test data sets. We illustrate its use in a data set of sailboat prices, where it yields information both on the identity of the outliers and on their location, spread, and the regression coefficients inside the minority subgroup. © 2006 Elsevier B.V. All rights reserved.
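The fitting scheme outlined in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation, and everything specific in it is an assumption made for illustration: the logistic form for the outlier probability, the normal and inverse-gamma priors and their hyperparameters, the random-walk Metropolis proposal, the random-subset approximation to the least median of squares start, and all function and variable names. The simulated data are synthetic and are not the sailboat data.

```python
import numpy as np

def lms_start(X, y, rng, n_subsets=500):
    """Approximate least-median-of-squares fit via random elemental subsets."""
    n, p = X.shape
    best_beta, best_med = None, np.inf
    for _ in range(n_subsets):
        idx = rng.choice(n, size=p, replace=False)
        try:
            beta = np.linalg.solve(X[idx], y[idx])  # exact fit through p points
        except np.linalg.LinAlgError:
            continue  # singular subset, skip it
        med = np.median((y - X @ beta) ** 2)
        if med < best_med:
            best_beta, best_med = beta, med
    return best_beta, 1.4826 * np.sqrt(best_med)  # fit and a rough robust scale

def log_norm(r, s2):
    """Log density of N(0, s2) evaluated at residuals r."""
    return -0.5 * (np.log(2 * np.pi * s2) + r ** 2 / s2)

def gibbs_cluster_outliers(X, y, n_iter=600, burn=200, seed=0,
                           tau2=1e4, a=2.0, b=1.0, g_var=25.0, step=0.1):
    """Return the posterior probability that each case is in the minority group.

    Assumed model: group k in {0, 1} has its own coefficients beta[k] and
    error variance sigma2[k]; P(z_i = 1 | x_i) = logistic(x_i' gamma).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    beta_lms, scale = lms_start(X, y, rng)
    # Initial labels: cases with large residuals from the robust start.
    z = (np.abs(y - X @ beta_lms) > 2.5 * scale).astype(int)
    beta = np.tile(beta_lms, (2, 1))
    sigma2 = np.full(2, scale ** 2)
    gamma = np.zeros(p)
    z_sum = np.zeros(n)

    def log_post_gamma(g):
        # Logistic log-likelihood of the current labels plus a N(0, g_var) prior.
        eta = X @ g
        return np.sum(z * eta - np.logaddexp(0.0, eta)) - g @ g / (2 * g_var)

    for it in range(n_iter):
        for k in (0, 1):
            Xk, yk = X[z == k], y[z == k]
            # Conjugate draws: beta[k] is normal, sigma2[k] is inverse gamma.
            V = np.linalg.inv(Xk.T @ Xk / sigma2[k] + np.eye(p) / tau2)
            mean = V @ (Xk.T @ yk) / sigma2[k]
            beta[k] = mean + np.linalg.cholesky(V) @ rng.standard_normal(p)
            ssr = np.sum((yk - Xk @ beta[k]) ** 2)
            sigma2[k] = 1.0 / rng.gamma(a + len(yk) / 2, 1.0 / (b + ssr / 2))
        # Random-walk Metropolis step for gamma (no closed-form conditional).
        prop = gamma + step * rng.standard_normal(p)
        if np.log(rng.random()) < log_post_gamma(prop) - log_post_gamma(gamma):
            gamma = prop
        # Redraw the labels from their Bernoulli full conditionals.
        log_odds = (X @ gamma + log_norm(y - X @ beta[1], sigma2[1])
                    - log_norm(y - X @ beta[0], sigma2[0]))
        z = (rng.random(n) < np.exp(-np.logaddexp(0.0, -log_odds))).astype(int)
        if it >= burn:
            z_sum += z
    return z_sum / (n_iter - burn)

# Synthetic check: 80 majority cases plus 20 clustered cases on another line.
rng = np.random.default_rng(1)
x = np.concatenate([rng.uniform(0, 10, 80), rng.uniform(8, 10, 20)])
X = np.column_stack([np.ones(100), x])
is_out = np.arange(100) >= 80
y = np.where(is_out, 40 - x, 1 + 2 * x) + rng.normal(0, 0.5, 100)
prob = gibbs_cluster_outliers(X, y)
```

Here `prob[i]` is the fraction of retained sweeps in which case `i` was assigned to the minority subgroup. Storing the draws of `beta[1]` and `sigma2[1]` in the same loop would give the kind of summary the abstract mentions for the regression coefficients and spread inside that subgroup.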

Similar resources

A method for simultaneous variable selection and outlier identification in linear regression

We suggest a method for simultaneous variable selection and outlier identification based on the computation of posterior model probabilities. This avoids the problem that the model you select depends upon the order in which variable selection and outlier identification are carried out. Our method can find multiple outliers and appears to be successful in identifying masked outliers. We also add...

Multiple-Case Outlier Detection in Multiple Linear Regression Model Using Quantum-Inspired Evolutionary Algorithm

In ordinary statistical methods, multiple outliers in a multiple linear regression model are detected sequentially, one after another, and smearing and masking effects give misleading results. If the potential multiple outliers can be detected simultaneously, smearing and masking effects can be avoided. Such multiple-case outlier detection is of a combinatorial nature and sets of possible outliers...

Contents Special Issue: Selected Papers of the IEEE International Conference on Computer and Information Technology (ICCIT 2009) Guest Editors: Syed

Practical Bayesian optimization in the presence of outliers

Inference in the presence of outliers is an important field of research as outliers are ubiquitous and may arise across a variety of problems and domains. Bayesian optimization is a method that heavily relies on probabilistic inference. This allows outstanding sample efficiency because the probabilistic machinery provides a memory of the whole optimization process. However, that virtue becomes a ...

Penalized Trimmed Squares and a Modification of Support Vectors for Unmasking Outliers in Linear Regression

We consider the problem of identifying multiple outliers in linear regression models. We propose a penalized trimmed squares (PTS) estimator, where penalty costs for discarding outliers are inserted into the loss function. We propose suitable penalties for unmasking the multiple high-leverage outliers. The robust procedure is formulated as a Quadratic Mixed Integer Programming (QMIP) problem,...

Journal:

Computational Statistics & Data Analysis

Volume 51

Publication date: 2007
